Abstract:
A multimicroprocessor system constructed of multimicroprocessor structures each including N-number of microprocessor units, a shared memory, an input/output unit and a register exchange circuit. The microprocessor units are uniform and include a microprocessor, a data memory, a parallel input-output interface, a sequential input/output circuit, a program memory and a bi-directional buffer. The buffer connects the internal data bus in the microprocessor unit to the shared instruction bus for the multimicroprocessor structure, and the enable inputs of the buffer are connected to the internal busses for circuit selection in the microprocessor unit by the microprocessor address lines. The address lines of the first microprocessor unit in the multimicroprocessor structure are connected also to a shared memory, input/output unit and both to the parallel data exchange register circuit and the "HALT" inputs of the microprocessors in the rest of the units through a logic circuit, which serves to switch off the microprocessor units. The multimicroprocessor structures are connected therebetween by first level data exchange register circuits, whose control inputs are connected to the address lines of the first microprocessor units in the respective first microprocessor structures. These first level register circuits are connected groupwise to a second level data exchange register circuit, whose control input is connected to the address lines of the microprocessor unit whose address lines are connected to the first of the first level data exchange register circuits.
Publication number: SU1420601A1
Application number: SU837772960
Filing date: 1983-04-26
Publication date: 1988-08-30
Inventor: Кирилов Касабов Никола
Applicant: Вмеи "Ленин" (Инопредприятие);
Main IPC class:
Description:

The invention relates to computing and can be used for parallel processing of information in various specialized classes of tasks: the fast Fourier transform, vector and matrix calculus, real-time processing of signals received simultaneously from several sources, processing of data obtained from physical and other tests, simultaneous control of several interconnected objects, and fast solution of systems of differential and linear equations.
A hierarchical computing system is known whose computing modules form a tree structure (see N. A. Deshmukh, R. G. Scott, P. P. Roberts, A hierarchically structured multi-microprocessors system. - Microprocessors and their applications, No. 13, 1979, pp. 317-327).
The closest to the invention in technical essence is a hierarchical computing system containing a group of computing modules, each of which contains a group of control and processing devices and a storage device; in each computing module of the group, the command inputs-outputs of the control and processing devices are connected to the information input-output of the storage device (see US patent No. 4245306, cl. G 06 F 15/16 (NCI 364/200), publ. 1981).
The disadvantages of the known systems are the complex organization of control and inter-module communication and the difficulty of reconfiguring these systems.
The purpose of the invention is to simplify the system.
The goal is achieved by introducing into the system register exchange blocks with corresponding connections; the simplicity of controlling them makes it possible to simplify the whole system.
FIG. 1 shows the block diagram of the control and processing device (the microprocessor module); FIG. 2 is a block diagram of a computing module (SIMD type); FIG. 3 is a block diagram of the register exchange unit; FIG. 4 is an example of a hierarchical computing system; FIG. 5 is an example of a computing system implemented on sixteen computing modules with four control and processing devices each.
The control and processing device 16 (FIG. 1) contains a microprocessor 1, a RAM block 2, a programmable parallel interface block 3, which is connected via input-output 4 to external data sources of the system and via exchange input-output 5 to internal data sources of the system, a programmable serial interface block 6, which is connected via input-output 7 to external data sources of the system, a command input-output 8, a data bus 9, a data buffer 10, an address bus 11 and a control bus 12, a clock input 13, an idle mode input 14, and a fixed memory block 15.
The computing module 35 (FIG. 2) contains a group of control and processing devices 16, a common command bus 17, a register exchange unit 18, a data bus 19, a permanent memory unit 20, an operational memory unit 21, an input-output unit 22, a stop (shutdown) block 23, and an exchange input 24 of block 18.
The register exchange unit 18 (FIG. 3) contains N registers 25 and a switching control node 26, which decodes the bits of the address input 27, log (N 1) in number, onto its outputs; these outputs control keys that connect the inputs and outputs of the registers 25. The first output of node 26 is connected to the control input of key 28, which connects the output of the first register 25 to the input of the second register 25; the second output is connected to key 29, which connects the output of the third register 25 to the input of the second register 25; and so on. The (N-3)rd output is connected to key 30, which connects the output of the first register 25 to the input of the last-but-one register 25; a further output is connected to key 31, which connects the Nth and (N-1)th registers 25; and a further output is connected to key 32, which connects the first register 25 to the last. All registers 25 are connected via information inputs-outputs 33 to devices 16 through their inputs-outputs 5, and the first register 25 also has an additional information input-output 34.
The hierarchical computing system (FIG. 4) consists of several computing modules 35 connected to several blocks 18. The inputs-outputs 34 of the first registers 25 in a certain number of modules 35 are connected to first-level register exchange blocks 36, whose inputs 27 are connected to the address buses 11 of the first devices 16 of the first module 35 in the group. The inputs-outputs 34 of the first registers 25 of the first-level register exchange blocks 36 are connected in groups to second-level register exchange blocks 36, whose address inputs are connected to the address bus 11 of the device 16 whose bus 11 is connected to the first block 36 of the first level, and so on. At the last hierarchical level of communication between devices 16 there is a single register exchange block 37, whose address inputs are connected to the address bus of the first device 16 of the first module 35. In this case, the address space of that first microprocessor 1 must also contain the exchange control addresses for the register exchange blocks of every level, from the zeroth to the last, while the remaining devices 16 have a smaller number of such addresses.
The computing system (FIG. 5) consists of sixteen computing modules 35 with four devices 16 each, marked M-0 to M-63. The register exchange blocks of the zero (18), first (36) and second (37) levels have registers 25 denoted R-0, R-4, ..., R-60; these numbers correspond to the device numbers of the entire system. Blocks 36 and 37 are controlled by the address buses of the first microprocessors 1 in a group. Systems of this type can be designed with a different number of devices 16 in modules 35 and with a different number of registers 25 in blocks 18, 36, 37. The minimum number of devices 16 in a module 35 is two. A regular structure is obtained when all modules 35 have two devices each and each register exchange block has two registers. In this case the number of levels is log2 n. Connections in systems of this type resemble a tree structure.
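As a rough illustration of the level count, the sketch below assumes a regular tree in which every register exchange block joins the same number of registers. The function name and its parameters are illustrative and do not come from the patent.

```python
def exchange_levels(devices: int, registers_per_block: int) -> int:
    """Count hierarchy levels (zero level included) in a regular exchange tree."""
    if devices < 2 or registers_per_block < 2:
        raise ValueError("need at least two devices and two registers per block")
    levels = 0
    groups = devices
    while groups > 1:
        # Each block joins up to `registers_per_block` groups into one.
        groups = -(-groups // registers_per_block)  # ceiling division
        levels += 1
    return levels


print(exchange_levels(64, 4))  # system of FIG. 5: zero, first and second levels
print(exchange_levels(64, 2))  # regular binary structure with 64 devices
```

With four registers per block, 64 devices need three levels, which matches the zero, first and second levels of FIG. 5; with two registers per block the count grows to log2 of the number of devices.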
The computing module 35 operates in the following manner.
All microprocessors 1 start at the same address held in their program counters: the address of the first instruction of the program stored in the permanent memory block 20. All microprocessors 1 address the same instruction, but only the first microprocessor 1 actually reads it from block 20, and the instruction code travels over the common command bus 17 to all microprocessors 1, since the buffers 10 are open. When the instruction contains an operand address in RAM block 2, each microprocessor 1 executes the instruction on the data located at that address in its own block 2; buffer 10 is then turned off, so that there is no link between bus 17 and the internal data bus 9. Each device 16 executes the instruction as an independent microcomputer. After a certain number of instructions it may be necessary to exchange data between devices 16. This is achieved as follows. Each microprocessor 1 sends its data through the exchange input-output 5 to the corresponding register 25 of block 18. This is done in parallel by the same sequence of instructions (a subroutine), after which a dummy instruction is executed (for example, a comparison that does not change the contents of memory cells) whose address is decoded by the logic of the stop block 23, and the Stop signal is sent to the standby-mode inputs 14 of the microprocessors 1. In a similar way an address is issued which, again through block 23, enables the exchange in block 18. Thereafter the first microprocessor 1 reads and executes dummy instructions whose addresses are exchange codes for block 18, until the required movement of data in the registers 25 is obtained. Care should be taken that the dummy instructions do not change the data in the first microprocessor 1; if this is not possible, the condition code should be saved in advance. It is advisable that these dummy instructions be short, so that the intended exchange is carried out quickly. After that, all microprocessors 1 either resume at the same address at which they were stopped, by means of the triggers in block 23, or another method can be used in which they execute a routine that reads one data word from their registers via input-output 5 of block 3. The system can then continue executing instructions or perform another exchange. The data in blocks 2 of each device 16 can be received from the outside in parallel via inputs-outputs 4, or transferred from the common memory block 21 via the command bus 17; when transferring to a single device 16, serial data can also be received through the inputs-outputs 7 of block 6. Clocking is performed from the common system clock generator through the inputs 13.
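The SIMD operation of one computing module can be pictured with a toy model: one instruction stream is broadcast over the common command bus, and every device applies it to its own local RAM. The instruction set (`LOAD`, `ADD`, `STORE`), the accumulator and all names below are assumptions made for this sketch; the patent does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Toy stand-in for a control and processing device 16."""
    ram: dict = field(default_factory=dict)  # local RAM (block 2)
    acc: int = 0                             # accumulator (assumed)

    def execute(self, op: str, addr: int) -> None:
        if op == "LOAD":
            self.acc = self.ram.get(addr, 0)
        elif op == "ADD":
            self.acc += self.ram.get(addr, 0)
        elif op == "STORE":
            self.ram[addr] = self.acc
        else:
            raise ValueError(f"unknown op {op!r}")


def run_simd(program, devices):
    """Broadcast each instruction to every device, as over the common bus 17."""
    for op, addr in program:
        for dev in devices:
            dev.execute(op, addr)


devices = [Device(ram={0: i, 1: 10}) for i in range(4)]
run_simd([("LOAD", 0), ("ADD", 1), ("STORE", 2)], devices)
print([d.ram[2] for d in devices])  # [10, 11, 12, 13]
```

Every device runs the same three instructions but ends with a different result, because the operand addresses refer to each device's own memory.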
Block 18 register exchange works as follows.
When transmitting a specific code on the input 27 of the address, one of the following exchange conversions between the registers 25 is implemented:
B1, B2, ..., each written as a two-row mapping such as

1 2 3 ... N
2 1 3 ... N

where the top line gives the sequence numbers of the registers 25 that receive the contents of the corresponding registers 25 on the bottom line. There are algorithms and programs for finding the decomposition of an arbitrary exchange between all N registers into a sequence of basic transformations. For example, suppose the fourth device 16 must send its content to the first, second and third devices 16 and also receive data from the first device 16, the addresses at which the transformations B1, B2, B3, B4, B5 are implemented are 80, 81, 82, 83, 84 respectively, and the microprocessors 1 are turned off by address A73 (all addresses are hexadecimal). Then the following sequence of instructions is needed, performed by the first device 16 (N = 4, the contents of the devices 16 to be exchanged are in their respective registers): FC81, FC84, FC83, since the required exchange can be represented by the transformation

1 2 3 4
4 4 4 1

which is represented by the sequence B2, B5, B4. Beforehand, an FCA73 instruction is issued, where FC is a dummy instruction code (an instruction that exists but causes no meaningful action in terms of the final result). If the transformation is a permutation, its execution takes no more than N - 1 steps, for which there is a simple analytical form and a corresponding program.
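The N = 4 example can be checked with a short sketch. It models only the two-row notation (the top row lists receiving registers, the bottom row the registers whose contents they receive) and the way a dummy instruction word is formed from the FC code and an address. The helper names are hypothetical, and the individual basic transformations are not modelled.

```python
def apply_mapping(contents, sources):
    """Register i (1-based) receives the contents of register sources[i - 1]."""
    return [contents[s - 1] for s in sources]


def dummy_instruction(address: int, opcode: int = 0xFC) -> str:
    """Form a dummy instruction word: the opcode followed by the hex address."""
    return f"{opcode:02X}{address:X}"


# Target exchange for N = 4: top row 1 2 3 4, bottom row 4 4 4 1.
registers = ["a", "b", "c", "d"]
print(apply_mapping(registers, [4, 4, 4, 1]))  # ['d', 'd', 'd', 'a']

# Stop address A73 first, then the three exchange addresses from the text.
sequence = [dummy_instruction(a) for a in (0xA73, 0x81, 0x84, 0x83)]
print(sequence)  # ['FCA73', 'FC81', 'FC84', 'FC83']
```

The first three registers end up holding the fourth device's data and the fourth register holds the first device's data, which is the exchange described in the example.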
The computing system (MSIMD type) works as follows. Each computing module 35 (a SIMD system) executes its own program, which, in particular, may coincide with the program of another module 35. If necessary, all devices 16 in the system can exchange data in an arbitrary way, i.e. one described by an arbitrary transformation of the set of all devices. The exchange is carried out as follows. The required transformation P (assuming it is a permutation) is decomposed into a product of cycles of the form (1 2), (1 2 3), ..., (1 2 ... N), where N is the total number of devices 16 in the system, after which the cycles are implemented sequentially by making parallel basic permutations in the register exchange blocks. For example, if for the system of FIG. 5 it is necessary to perform the permutation p = (0 2 4 ... 24 26 1 3 ... 25 27 28 29 ... 53 54), specified as one cycle rather than as a two-line mapping, it can be decomposed by the standard procedure into the product (0 1 2 ... 25 26)(0 1 2 ... 53 54). The first permutation is implemented within three steps (one basic permutation is realized per step). In the first step, the permutations (0 1 2 3), (4 5 6 7), (8 9 10 11), (12 13 14 15), (16 17 18 19), (20 21 22 23), (24 25 26) are implemented in parallel in the zero-level register exchange blocks 18. In the second step, the permutations (0 4 8 12), (16 20 24) are implemented in parallel in the first-level register exchange blocks. In the third step, the second-level permutation (0 16) is implemented. The numbers denote the registers in the blocks of the different levels that correspond to the given device 16 of the entire system. The second permutation is also implemented within three steps: all full cycles from (0 1 2 3) to (48 49 50 51), together with the cycle (52 53 54), in the first step; the cycles (0 4 8 12), (16 20 24 28), (32 36 40 44), (48 52) in the second step; and the permutation (0 16 32 48) in the third step. The whole permutation p is thus implemented within 6 steps.
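The decomposition in this example can be checked mechanically. The sketch below represents permutations of the 55 devices as lists and assumes that a product is read left to right (the left factor is applied first); the patent does not state a convention, and this one is chosen because it reproduces the stated result. All function names are illustrative.

```python
def cycle(elems, n):
    """Return the permutation of range(n) defined by a single cycle."""
    perm = list(range(n))
    for i, x in enumerate(elems):
        perm[x] = elems[(i + 1) % len(elems)]
    return perm


def then(first, second):
    """Compose two permutations: apply `first`, then `second`."""
    return [second[first[x]] for x in range(len(first))]


def step(cycles, n):
    """One exchange step: disjoint cycles carried out in parallel."""
    perm = list(range(n))
    for c in cycles:
        perm = then(perm, cycle(c, n))
    return perm


N = 55
# p as one cycle: 0 2 4 ... 26 1 3 ... 25 27 28 ... 54
p = cycle(list(range(0, 27, 2)) + list(range(1, 26, 2)) + list(range(27, 55)), N)
first = cycle(list(range(27)), N)    # (0 1 2 ... 26)
second = cycle(list(range(55)), N)   # (0 1 2 ... 54)
print(then(first, second) == p)

# Three hierarchical steps that realise the first permutation.
level0 = step([list(range(s, min(s + 4, 27))) for s in range(0, 27, 4)], N)
level1 = step([[0, 4, 8, 12], [16, 20, 24]], N)
level2 = step([[0, 16]], N)
print(then(then(level0, level1), level2) == first)
```

Both comparisons hold under this convention: the two long cycles multiply to p, and the zero-, first- and second-level steps applied in that order reproduce the cycle (0 1 2 ... 26).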
Computing systems can be created using various microprocessor sets while preserving the proposed organization.
If device 16 contains a fixed memory block 15 connected to the local address space of the device, the system (FIG. 2) becomes a SIMD/MIMD system, i.e. functional reconfiguration from one type to the other is possible depending only on the address held in the program counter of microprocessor 1 of the device 16. If it addresses a program located in block 15, the device 16 acts independently of the others (this is a MIMD system). It is possible that at a given moment some of the devices 16 of the system (FIG. 2) operate according to their own programs, while others execute a common program stored in the common RAM unit 21. A device 16 switches from its own program to the common one when its own program jumps to an address outside the local address space of device 16, and this address holds common instructions to be executed by several devices. The functional reconfiguration, which is performed automatically, is a significant advantage of the invention, especially since its implementation is simple. This possibility extends the range of uses of the invention, increases speed and saves system memory.
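The address-driven reconfiguration can be sketched as a simple classification of program counter values. The address range assigned to the local fixed memory block 15 below is purely hypothetical, as are the function names; the patent gives no concrete memory map.

```python
# Hypothetical span of the local fixed memory block 15 in a device's address space.
LOCAL_ROM = range(0x0000, 0x0800)


def runs_own_program(program_counter: int) -> bool:
    """True if the device fetches from its local block 15 (acts independently)."""
    return program_counter in LOCAL_ROM


def system_mode(program_counters) -> str:
    """Classify a module by where its devices' program counters point."""
    flags = {runs_own_program(pc) for pc in program_counters}
    if flags == {True}:
        return "MIMD"
    if flags == {False}:
        return "SIMD"
    return "mixed"


print(system_mode([0x0010, 0x0200, 0x07FF]))  # all in local memory
print(system_mode([0x2000, 0x2000, 0x2000]))  # all on the common program
print(system_mode([0x0010, 0x2000]))          # some of each
```

No mode register is involved in this model: a jump into or out of the local address range is all that changes how a device behaves, which is the point the text makes about automatic reconfiguration.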
Thus, the presence of block 15 in device 16 allows a hierarchical computing system designed as an MSIMD system to be reconfigured functionally and automatically, during program execution, into an MSIMD, SIMD or MIMD system. This increases the efficiency of computation, since in some tasks the available parallelism is insufficient to load all devices 16. In this case, some of the devices work according to their own programs.
Claims:
Claims (1)
[1]
Invention Formula
A computing system containing groups of computing modules 35, each of which contains a group of control and processing devices 16 and a storage device 20, 21, wherein in each computing module 35 of the group the command inputs-outputs of the control and processing devices 16 of the group are connected to the information input-output of the storage device 20, 21, characterized in that, in order to simplify the system, a register exchange block 18 and a stop block 23 are introduced into each computing module 35 of the group, and in each computing module 35 the first information input-output of the register exchange block 18 is the exchange input-output of the computing module 35, the second to (p + 1)-th information inputs-outputs (p is the number of control and processing devices in the group) and the first to p-th exchange control inputs of the register exchange block 18 are connected, respectively, to the exchange inputs-outputs and the exchange control outputs of the first to p-th control and processing devices 16 of the group, the address output of the first control and processing device 16 of the group is the exchange address output of the computing module 35 and is connected to the address inputs of the storage device 20, 21, the register exchange block 18 and the stop block 23, the outputs of which are connected to the standby-mode setting inputs of the corresponding control and processing devices 16 of the group and to the exchange input of the register exchange block 18; in addition, the system contains n groups of register exchange blocks (where n is the number of hierarchical levels in the system), the information inputs-outputs of the register exchange block 37 of the m-th group (m = 1, ..., p) are connected to the first information inputs-outputs of the register exchange blocks of the (m-1)-th group, the remaining information inputs-outputs of which are connected to the first information inputs-outputs of the register exchange blocks 36 of the (m-2)-th group, the remaining information inputs-outputs of each of which are connected to the exchange inputs-outputs of the computing modules 35 of the corresponding groups, and the exchange address outputs of the computing modules 35 of the groups are connected to the address inputs of the register exchange blocks 36 and 37 of the corresponding groups.
Similar technologies:
Publication number | Publication date | Patent title
SU1420601A1|1988-08-30|Computing system
US4509113A|1985-04-02|Peripheral interface adapter circuit for use in I/O controller card having multiple modes of operation
US3787818A|1974-01-22|Mult-processor data processing system
US3678467A|1972-07-18|Multiprocessor with cooperative program execution
JPH02168341A|1990-06-28|Data processing system
Händler et al.1976|A general purpose array with a broad spectrum of applications
US4731737A|1988-03-15|High speed intelligent distributed control memory system
JPS56114063A|1981-09-08|Multiprocessor
ES457282A1|1978-02-01|Programmable sequential logic
SU1072054A1|1984-02-07|Multiprocessor grate-controller
SU951315A1|1982-08-15|Device for interfacing processor with multi-unit memory
SU618733A1|1978-08-05|Microprocessor for data input-output
SU602950A1|1978-04-15|Serial-action computing system
SU1575168A1|1990-06-30|Device for isolation of median of three numbers
SU1277129A1|1986-12-15|Multiprocessor computer system
SU1123055A1|1984-11-07|Address unit for storage
SU751238A1|1983-07-15|Multiprocessor computer system
SU771655A1|1980-10-15|Volume control device
SU849219A1|1981-07-23|Data processing system
Händler et al.1975|Fitting processors to the needs of a general purpose array |
SU1569843A1|1990-06-07|Multicompressor computer system
JPS6048504A|1985-03-16|Connection system of sequence controller
SU1108460A1|1984-08-15|Device for solving differential equations
SU614432A1|1978-07-05|Telemechanics system-computer interfage
RU1798797C|1993-02-28|Multiprocessor system
Patent family:
Publication number | Publication date
DE3314917A1|1983-11-03|
NL8301477A|1983-11-16|
JPS5917657A|1984-01-28|
GB2122781B|1985-08-07|
GB2122781A|1984-01-18|
US4591981A|1986-05-27|
DK178983D0|1983-04-22|
FR2525787A1|1983-10-28|
DK178983A|1983-10-27|
FR2525787B3|1985-03-01|
GB8311311D0|1983-06-02|
IN157908B|1986-07-19|
BG35575A1|1984-05-15|
HU186323B|1985-07-29|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title

US3308436A|1963-08-05|1967-03-07|Westinghouse Electric Corp|Parallel computer system control|
US3753234A|1972-02-25|1973-08-14|Reliance Electric Co|Multicomputer system with simultaneous data interchange between computers|
GB1481393A|1974-02-28|1977-07-27|Burroughs Corp|Information processing systems|
US4149242A|1977-05-06|1979-04-10|Bell Telephone Laboratories, Incorporated|Data interface apparatus for multiple sequential processors|
US4247892A|1978-10-12|1981-01-27|Lawrence Patrick N|Arrays of machines such as computers|
DE2920994A1|1979-05-23|1980-11-27|Siemens Ag|DATA SEND / RECEIVER DEVICE WITH PARALLEL / SERIAL AND SERIAL / PARALLEL CHARACTERS CONVERSION, IN PARTICULAR FOR DATA EXCHANGE BETWEEN COMMUNICATING DATA PROCESSING SYSTEMS|
US4344134A|1980-06-30|1982-08-10|Burroughs Corporation|Partitionable parallel processor|
US4412285A|1981-04-01|1983-10-25|Teradata Corporation|Multiprocessor intercommunication system and method|
GB2140943A|1983-06-03|1984-12-05|Burke Cole Pullman|Improvements relating to computers|
JPH0368429B2|1983-11-07|1991-10-28|Masahiro Sowa|
US5228127A|1985-06-24|1993-07-13|Fujitsu Limited|Clustered multiprocessor system with global controller connected to each cluster memory control unit for directing order from processor to different cluster processors|
US4855903A|1984-12-20|1989-08-08|State University Of New York|Topologically-distributed-memory multiprocessor computer|
US4827403A|1986-11-24|1989-05-02|Thinking Machines Corporation|Virtual processor techniques in a SIMD multiprocessor array|
JPH0787461B2|1987-06-19|1995-09-20|株式会社東芝|Local Area Network System|
AU1993088A|1987-06-19|1989-01-19|Human Devices, Inc.|Multiply-installable, multi-processor board for personal computer and workstation expansion buses|
JPH07104837B2|1987-11-25|1995-11-13|富士通株式会社|Processor control method|
FR2626091B1|1988-01-15|1994-05-06|Thomson Csf|HIGH POWER COMPUTER AND COMPUTING DEVICE COMPRISING A PLURALITY OF COMPUTERS|
EP0378115B1|1989-01-06|1998-09-30|Hitachi, Ltd.|Neural computer|
JPH0550018B2|1988-05-31|1993-07-27|Fujitsu Ltd|
US5111423A|1988-07-21|1992-05-05|Altera Corporation|Programmable interface for computer system peripheral circuit card|
US5136717A|1988-11-23|1992-08-04|Flavors Technology Inc.|Realtime systolic, multiple-instruction, single-data parallel computer system|
US5218709A|1989-12-28|1993-06-08|The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration|Special purpose parallel computer architecture for real-time control and simulation in robotic applications|
EP0509055A4|1990-01-05|1994-07-27|Maspar Computer Corp|Parallel processor memory system|
JPH07122866B1|1990-05-07|1995-12-25|Mitsubishi Electric Corp|
US5355508A|1990-05-07|1994-10-11|Mitsubishi Denki Kabushiki Kaisha|Parallel data processing system combining a SIMD unit with a MIMD unit and sharing a common bus, memory, and system controller|
EP0485594A4|1990-05-30|1995-02-01|Adaptive Solutions Inc|Mechanism providing concurrent computational/communications in simd architecture|
EP0471928B1|1990-08-20|1999-07-14|Kabushiki Kaisha Toshiba|Connection state confirmation system and method for expansion unit|
EP0485690B1|1990-11-13|1999-05-26|International Business Machines Corporation|Parallel associative processor system|
US5175858A|1991-03-04|1992-12-29|Adaptive Solutions, Inc.|Mechanism providing concurrent computational/communications in SIMD architecture|
US5361367A|1991-06-10|1994-11-01|The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration|Highly parallel reconfigurable computer architecture for robotic computation having plural processor cells each having right and left ensembles of plural processors|
WO1993011503A1|1991-12-06|1993-06-10|Norman Richard S|Massively-parallel direct output processor array|
CA2078912A1|1992-01-07|1993-07-08|Robert Edward Cypher|Hierarchical interconnection networks for parallel processing|
JP3290798B2|1994-03-14|2002-06-10|富士通株式会社|Parallel computer|
US6408402B1|1994-03-22|2002-06-18|Hyperchip Inc.|Efficient direct replacement cell fault tolerant architecture|
DE69519426T2|1994-03-22|2001-06-21|Hyperchip Inc|Cell-based fault-tolerant architecture with advantageous use of the unallocated redundant cells|
JPH08249254A|1995-03-15|1996-09-27|Mitsubishi Electric Corp|Multicomputer system|
US5630161A|1995-04-24|1997-05-13|Martin Marietta Corp.|Serial-parallel digital signal processor|
US5649179A|1995-05-19|1997-07-15|Motorola, Inc.|Dynamic instruction allocation for a SIMD processor|
JPH09190423A|1995-11-08|1997-07-22|Nkk Corp|Information processing unit, information processing structure unit, information processing structure body, memory structure unit and semiconductor storage device|
US5903771A|1996-01-16|1999-05-11|Alacron, Inc.|Scalable multi-processor architecture for SIMD and MIMD operations|
US6079008A|1998-04-03|2000-06-20|Patton Electronics Co.|Multiple thread multiple data predictive coded parallel processing system and method|
GB2399190B|2003-03-07|2005-11-16| Zarlink Semiconductor Limited|Parallel processing architecture|
US20040255096A1|2003-06-11|2004-12-16|Norman Richard S.|Method for continuous linear production of integrated circuits|
US8755515B1|2008-09-29|2014-06-17|Wai Wu|Parallel signal processing system and method|
Legal status:
Priority:
Application number | Filing date | Patent title
BG8256357A|BG35575A1|1982-04-26|1982-04-26|Multimicroprocessor system|